Human-Centric Data Cleaning [Vision]

Authors

  • El Kindi Rezig
  • Mourad Ouzzani
  • Ahmed K. Elmagarmid
  • Walid G. Aref
Abstract

Data cleaning refers to the process of detecting and fixing errors in data. Human involvement is instrumental at several stages of this process, e.g., to identify and repair errors or to validate computed repairs. There is currently a plethora of data cleaning algorithms addressing a wide range of data errors (e.g., duplicates, violations of integrity constraints, and missing values). Many of these algorithms involve a human in the loop; however, the human's involvement is usually coupled to the underlying cleaning algorithm. There is currently no end-to-end data cleaning framework that systematically involves humans in the cleaning pipeline regardless of the underlying cleaning algorithms. In this paper, we highlight key challenges that need to be addressed to realize such a framework. We present a design vision and discuss scenarios that motivate the need for such a framework to judiciously assist humans in the cleaning process. Finally, we present directions to implement such a framework.

Similar resources

NEIBank: Genomics and bioinformatics resources for vision research

NEIBank is an integrated resource for genomics and bioinformatics in vision research. It includes expressed sequence tag (EST) data and sequence-verified cDNA clones for multiple eye tissues of several species, web-based access to human eye-specific SAGE data through EyeSAGE, and comprehensive, annotated databases of known human eye disease genes and candidate disease gene loci. All expression-...

Software Engineering Education for the Pervasive Human-Centric Computing Era

Moving away from decades of machine-centric computing and making pervasive human-centric computing, the new wave of computing, a reality revolutionises the relationship between humans and computing systems. In order to implement the vision of pervasive human-centric computing, it is necessary to reform software engineering education to well prepare graduates of software engineering programmes f...

When sensing goes pervasive

In line with the pervasive vision, pervasive sensing allows the provision of ubiquitous and pervasive monitoring and heterogenous data collection. In the past decade, two dominant pervasive sensing paradigms have emerged: a mostly human-free paradigm centered around wireless sensor networks and a human-centric paradigm fueled by the rise of personal smart devices (smartphones and wearables). In...

A 2D + 3D Rich Data Approach to Scene Understanding

On your one-minute walk from the coffee machine to your desk each morning, you pass by dozens of scenes – a kitchen, an elevator, your office – and you effortlessly recognize them and perceive their 3D structure. But this one-minute scene-understanding problem has been an open challenge in computer vision since the field was first established 50 years ago. In this dissertation, we aim to rethin...

Ubiquitous Knowledge Bases for the Semantic Web of Things

The Semantic Web of Things (SWoT) is an emerging vision in Information and Communication Technology, joining the Semantic Web and the Internet of Things. The Semantic Web initiative (Berners-Lee, Hendler & Lassila, 2001) aims to allow software agents to share, reuse and combine information available in the World Wide Web. The Internet of Things vision (International Telecommunic...

Journal title:
  • CoRR

Volume abs/1712.08971  Issue

Pages  -

Publication date 2017